Eliminating Implausible Korean Morphological Interpretations by Using History of Previous Analysis and Lexical Association

Authors

  • Bong-Rae Park
  • Young-Sook Hwang
Abstract

A Korean word can often be analyzed into many competing interpretations because Korean is a highly inflective and agglutinative language. Although conventional Korean morphological analyzers try to eliminate implausible interpretations by using various kinds of intra-word connectivity information, some implausible interpretations still cannot be eliminated. In this paper, we present a method of eliminating implausible morphological interpretations by using two kinds of inter-word information. The first is extracted from the history of previous analyses: we prefer interpretations that include lexical morphemes which can be consistently analyzed from both the current word and its morphologically similar previous words in the history. The second is the lexical association between two morphemes: we prefer interpretations that respect a lexical association between a morpheme and its adjacent morphemes. These lexical associations are automatically collected from a very large raw corpus. The experimental results show that our system reduces the word ambiguity rate from 3.50 to 2.73.
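
The two preferences described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the representation of the history as a plain set of lexical morphemes, the representation of associations as a set of adjacent morpheme pairs, and the romanized toy data are all assumptions made for illustration. Each filter keeps every candidate when none is preferred, so it only ever removes interpretations when a better-supported one exists.

```python
from typing import List, Set, Tuple

# One candidate analysis of a word: a sequence of morphemes (hypothetical representation).
Interpretation = Tuple[str, ...]


def filter_by_history(candidates: List[Interpretation],
                      history: Set[str]) -> List[Interpretation]:
    """Prefer candidates containing a morpheme already seen in earlier analyses.

    Assumption: the history is simplified to a set of lexical morphemes.
    If no candidate is supported by the history, all candidates are kept.
    """
    preferred = [c for c in candidates if any(m in history for m in c)]
    return preferred or list(candidates)


def filter_by_association(candidates: List[Interpretation],
                          prev_morpheme: str,
                          associations: Set[Tuple[str, str]]) -> List[Interpretation]:
    """Prefer candidates whose first morpheme is associated with the preceding morpheme.

    Assumption: associations are stored as (left, right) pairs of adjacent morphemes.
    If no candidate respects an association, all candidates are kept.
    """
    preferred = [c for c in candidates if (prev_morpheme, c[0]) in associations]
    return preferred or list(candidates)


def ambiguity_rate(analyses: List[List[Interpretation]]) -> float:
    """Average number of remaining interpretations per word."""
    return sum(len(a) for a in analyses) / len(analyses)


# Toy data, illustrative only: two competing analyses of one word.
candidates = [("na", "nun"), ("nal", "nun")]
by_history = filter_by_history(candidates, history={"nal"})
by_assoc = filter_by_association(candidates, "sae", {("sae", "nal")})
unfiltered = filter_by_history(candidates, history=set())
```

With the toy data, both filters keep only the `("nal", "nun")` analysis, while an empty history leaves both candidates in place.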

Similar resources

The Comparative Impact of Pictorial Annotations and Morphological Instruction on Lexical Inferencing of Iranian Intermediate EFL Learners

One of the main ways to acquire unfamiliar words is to guess at their meaning. This study investigates the comparative effects of pictorial annotations and morphological instruction on Iranian EFL learners’ lexical inferencing ability. After checking homogeneity with the PET (Preliminary English Test), the researchers assigned the participants to two experimental and one control g...

An Open-Source Finite State Morphological Transducer for Modern Standard Arabic

We develop an open-source large-scale finite-state morphological processing toolkit (AraComLex) for Modern Standard Arabic (MSA) distributed under the GPLv3 license. The morphological transducer is based on a lexical database specifically constructed for this purpose. In contrast to previous resources, the database is tuned to MSA, eliminating lexical entries no longer attested in contemporary ...

A Supervised Approach to Keyword Extraction from Persian Documents Using Lexical Chains

Keywords are the main focal points of interest within a text and are intended to represent the principal concepts outlined in the document. Determining keywords with traditional methods is a time-consuming process that requires specialized knowledge of the subject. For the purposes of indexing the vast expanse of electronic documents, it is important to automate the keyword extraction task. S...

Lexical Analysis of Agglutinative Languages Using a Dictionary of Lemmas and Lexical Transducers

This paper presents a simple method for performing lexical analysis of agglutinative languages such as Korean, which have a heavy morphology. In particular, for nouns and adverbs with regular morphological modifications and/or high productivity, we do not need to artificially construct huge dictionaries of all inflected forms of lemmas. To construct a dictionary of lemmas and lexical transducers, f...

Klex: A Finite-State Transducer Lexicon of Korean

This paper describes the implementation and system details of Klex, a finite-state transducer lexicon for the Korean language, developed using XRCE’s Xerox Finite State Tool (XFST). Klex is essentially a transducer network representing the lexicon of the Korean language with the lexical string on the upper side and the inflected surface string on the lower side. Two major applications for Klex ...

Publication date: 2007